PERCENTILEOFSCORE
Overview
The PERCENTILEOFSCORE function computes the percentile rank of a given score relative to a reference dataset. A percentile rank indicates what percentage of values in the dataset fall below the specified score. For example, a percentile rank of 80 means that 80% of the values in the dataset are below the given score. When the dataset contains values equal to the score, the exact result depends on the tie-handling method chosen with pos_method.
This function wraps scipy.stats.percentileofscore from the SciPy library. The underlying implementation is available in the SciPy GitHub repository.
Percentile ranks are commonly used in educational testing, standardized assessments, and statistical analysis to interpret scores relative to a norm group. Unlike percentiles (which identify the score at a given percentage), percentile rank identifies the percentage corresponding to a given score.
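The distinction between the two directions can be illustrated with a small pure-Python sketch. It assumes the nearest-rank definition of a percentile, which is only one of several conventions, and both helper names are hypothetical and not part of SciPy or of this function.

```python
import math


def nearest_rank_percentile(values, pct):
    """Percentage -> score, using the nearest-rank convention (an assumption)."""
    ordered = sorted(values)
    # Smallest 1-based position k such that k / n >= pct / 100.
    k = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[k - 1]


def share_at_or_below(values, score):
    """Score -> percentage of values less than or equal to the score."""
    return sum(1 for v in values if v <= score) * 100 / len(values)


scores = [50, 10, 40, 20, 30]
print(nearest_rank_percentile(scores, 60))  # score found at the 60th percentile
print(share_at_or_below(scores, 30))        # percentage associated with the score 30
```

With these five values the two calls are inverses of each other: the 60th percentile is the score 30, and the score 30 sits at 60%.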
The function supports four different methods for handling ties and calculating the rank, specified via the pos_method parameter:
- rank (default): Computes the average percentage ranking. When multiple values match the score, the percentage rankings of all matching scores are averaged.
- weak: Corresponds to the cumulative distribution function (CDF) interpretation. The result represents the percentage of values less than or equal to the score.
- strict: Counts only values strictly less than the given score.
- mean: Returns the average of the “weak” and “strict” calculations, which aligns with the classic percentile rank formula:
PR = \frac{CF' + 0.5 \times F}{N} \times 100
where CF' is the number of values strictly below the score, F is the number of values equal to the score, and N is the total number of values.
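As a worked instance of this formula, take the dataset 1, 2, 3, 3, 4 and the score 3. Two values lie below the score (CF' = 2), two values equal it (F = 2), and there are five values in total (N = 5):

```latex
PR = \frac{2 + 0.5 \times 2}{5} \times 100 = \frac{3}{5} \times 100 = 60
```

The same number is obtained by averaging the strict result (2/5 = 40%) and the weak result (4/5 = 80%).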
The function returns a value between 0 and 100 representing the percentile rank.
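The four methods can be written out directly from two counts: the number of values strictly below the score and the number at or below it. The following is a minimal pure-Python sketch of the definitions above, not SciPy's actual implementation; the function name `percentile_rank` and its argument names are hypothetical.

```python
def percentile_rank(values, score, kind="rank"):
    """Sketch of the four tie-handling rules described above."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")

    below = sum(1 for v in values if v < score)         # CF' in the formula
    at_or_below = sum(1 for v in values if v <= score)  # CF' + F

    if kind == "strict":
        return below * 100 / n
    if kind == "weak":
        return at_or_below * 100 / n
    if kind == "mean":
        # Average of strict and weak: (CF' + 0.5 * F) / N * 100.
        return (below + at_or_below) * 50 / n
    if kind == "rank":
        # Matching values occupy ranks below+1 .. at_or_below; average those
        # ranks. With no matching value this reduces to the strict result.
        has_match = 1 if at_or_below > below else 0
        return (below + at_or_below + has_match) * 50 / n
    raise ValueError(f"unknown kind: {kind!r}")


data = [1, 2, 3, 3, 4]
for kind in ("rank", "weak", "strict", "mean"):
    print(kind, percentile_rank(data, 3, kind))
```

For this dataset the methods differ only because the score 3 appears twice; when no value equals the score, all four methods return the same number.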
This example function is provided as-is without any representation of accuracy.
Excel Usage
=PERCENTILEOFSCORE(data, score, pos_method)
- data (list[list], required): Reference data array of numeric values.
- score (float, required): The score value to calculate the percentile rank for.
- pos_method (str, optional, default: “rank”): Ranking method for handling ties.
Returns (list[list]): 2D list containing the percentile rank (0-100), or error message string.
Examples
Example 1: Basic rank method at 75th percentile
Inputs:
| data | score | pos_method |
|---|---|---|
| 1 | 3 | rank |
| 2 | | |
| 3 | | |
| 4 | | |
Excel formula:
=PERCENTILEOFSCORE({1;2;3;4}, 3, "rank")
Expected output:
| Result |
|---|
| 75 |
Example 2: Tied scores with mean method
Inputs:
| data | score | pos_method |
|---|---|---|
| 1 | 3 | mean |
| 2 | | |
| 3 | | |
| 3 | | |
| 4 | | |
Excel formula:
=PERCENTILEOFSCORE({1;2;3;3;4}, 3, "mean")
Expected output:
| Result |
|---|
| 60 |
Example 3: Strict method excludes ties
Inputs:
| data | score | pos_method |
|---|---|---|
| 1 | 3 | strict |
| 2 | | |
| 3 | | |
| 3 | | |
| 4 | | |
Excel formula:
=PERCENTILEOFSCORE({1;2;3;3;4}, 3, "strict")
Expected output:
| Result |
|---|
| 40 |
Example 4: Weak method includes ties
Inputs:
| data | score | pos_method |
|---|---|---|
| 1 | 3 | weak |
| 2 | | |
| 3 | | |
| 3 | | |
| 4 | | |
Excel formula:
=PERCENTILEOFSCORE({1;2;3;3;4}, 3, "weak")
Expected output:
| Result |
|---|
| 80 |
Python Code
import math

from scipy.stats import percentileofscore as scipy_percentileofscore


def percentileofscore(data, score, pos_method='rank'):
    """
    Computes the percentile rank of a score relative to the input data.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.percentileofscore.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): Reference data array of numeric values.
        score (float): The score value to calculate the percentile rank for.
        pos_method (str, optional): Ranking method for handling ties. Valid options (case-sensitive): 'rank', 'weak', 'strict', 'mean'. Default is 'rank'.

    Returns:
        list[list]: 2D list containing the percentile rank (0-100), or error message string.
    """
    def to2d(x):
        return [[x]] if not isinstance(x, list) else x

    # Normalize input to 2D list
    data = to2d(data)

    # Flatten 2D list to 1D
    arr = []
    for row in data:
        if isinstance(row, list):
            arr.extend(row)
        else:
            arr.append(row)

    # Convert to floats
    try:
        arr = [float(x) for x in arr]
    except (TypeError, ValueError):
        return "Invalid input: data must be numeric."

    if len(arr) == 0:
        return "Invalid input: data must not be empty."

    try:
        s = float(score)
    except (TypeError, ValueError):
        return "Invalid input: score must be a number."

    # The comparison is case-sensitive: 'Rank' or 'RANK' is rejected.
    valid_methods = ('rank', 'weak', 'strict', 'mean')
    if pos_method not in valid_methods:
        return f"Invalid input: pos_method must be one of {valid_methods}."

    try:
        result = scipy_percentileofscore(arr, s, kind=pos_method)
    except Exception as e:
        return f"scipy.stats.percentileofscore error: {e}"

    # Disallow nan/inf
    if result is None or (isinstance(result, float) and not math.isfinite(result)):
        return "Invalid result: output is nan or inf."

    return [[float(result)]]
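
The input-normalization step in the code above can be looked at in isolation. The sketch below assumes that an Excel column range such as {1;2;3;4} arrives as a 2D list with one value per row; the helper name `flatten_numeric` is hypothetical and only mirrors the flattening logic for illustration.

```python
def flatten_numeric(data):
    """Flatten a scalar, a 1D list, or a 2D list into a flat list of floats."""
    rows = data if isinstance(data, list) else [[data]]
    flat = []
    for row in rows:
        if isinstance(row, list):
            flat.extend(row)   # a row of a 2D range
        else:
            flat.append(row)   # a bare value in a 1D list
    # float() raises TypeError or ValueError for non-numeric entries.
    return [float(x) for x in flat]


print(flatten_numeric([[1], [2], [3], [4]]))  # column range, one value per row
print(flatten_numeric([[1, 2], [3, 4]]))      # rectangular range, read row by row
print(flatten_numeric(5))                     # single cell
```

Because the values are read row by row and percentile rank does not depend on order, a column range, a row range, and a rectangular range holding the same numbers all give the same result.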